{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c25497f4",
   "metadata": {},
   "source": [
    "**NOTE:** This notebook was written in 2024, and is not guaranteed to work with the latest version of llama-index. It is presented here for reference only.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42c463c6-a36e-41ed-9897-0b7b25417deb",
   "metadata": {},
   "source": [
    "![Slide One](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/1.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "08f96a7c-6854-4421-ae55-7206ca265382",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Two](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/2.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4bf4cac4-4a35-4eb1-a2a2-39388ee69030",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Three](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/3.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0eace5e-b710-4879-822b-2fe88257bec2",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Four](https://d3ddy8balm3goa.cloudfront.net/vector-oss-tools/draft/4.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "86b21a33-f05f-4cb0-81a4-1d7b26d4ed10",
   "metadata": {},
   "source": [
    "## Observability: Arize AI\n",
    "\n",
    "Follow the quickstart guide found [here](https://github.com/Arize-ai/openinference/tree/main/python/instrumentation/openinference-instrumentation-llama-index#quickstart)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d961c72b-12ac-4134-b801-093f61f2766f",
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "%pip install --upgrade \\\n",
    "    openinference-instrumentation-llama-index \\\n",
    "    opentelemetry-sdk \\\n",
    "    opentelemetry-exporter-otlp \\\n",
    "    \"opentelemetry-proto>=1.12.0\" \\\n",
    "    arize-phoenix -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "524c697e-4b8d-4893-a6eb-1278e8d5143e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "get_ipython().system = os.system\n",
    "\n",
    "!python -m phoenix.server.main serve > arize.log 2>&1 &"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "69298cc6-c951-45f9-addb-3203d14e6191",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n",
    "from opentelemetry.exporter.otlp.proto.http.trace_exporter import (\n",
    "    OTLPSpanExporter,\n",
    ")\n",
    "from opentelemetry.sdk import trace as trace_sdk\n",
    "from opentelemetry.sdk.trace.export import SimpleSpanProcessor\n",
    "\n",
    "endpoint = \"http://127.0.0.1:6006/v1/traces\"\n",
    "tracer_provider = trace_sdk.TracerProvider()\n",
    "tracer_provider.add_span_processor(\n",
    "    SimpleSpanProcessor(OTLPSpanExporter(endpoint))\n",
    ")\n",
    "\n",
    "LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20ff8c48-a64c-4dea-b54b-6c46fc7b6bec",
   "metadata": {},
   "source": [
    "## Example: A Gang of LLMs Tell A Story"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a450e92a-de33-44c3-854e-05ed1ef31b3c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# INSTALL LLM INTEGRATION PACKAGES\n",
    "%pip install llama-index-llms-openai -q\n",
    "%pip install llama-index-llms-cohere -q\n",
    "%pip install llama-index-llms-anthropic -q\n",
    "%pip install llama-index-llms-mistralai -q\n",
    "%pip install llama-index-vector-stores-qdrant -q\n",
    "%pip install llama-index-agent-openai -q\n",
    "%pip install llama-index-agent-introspective -q\n",
    "%pip install google-api-python-client -q\n",
    "%pip install llama-index-program-openai -q\n",
    "%pip install llama-index-readers-file -q\n",
    "\n",
    "# INSTALL OTHER DEPS\n",
    "%pip install pyvis -q"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b29f3ca-9ac2-4fd4-80cf-d5e3e2d4f83d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74fd3581-e029-45f9-805d-6d3b07cf8651",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.anthropic import Anthropic\n",
    "from llama_index.llms.cohere import Cohere\n",
    "from llama_index.llms.mistralai import MistralAI\n",
    "from llama_index.llms.openai import OpenAI\n",
    "\n",
    "anthropic_llm = Anthropic(model=\"claude-3-opus-20240229\")\n",
    "cohere_llm = Cohere(model=\"command\")\n",
    "mistral_llm = MistralAI(model=\"mistral-large-latest\")\n",
    "openai_llm = OpenAI(model=\"gpt-4o\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15685a8e-6efd-4f38-9fed-a2f8f4a0c5a0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a world where pizza toppings knew no bounds, one daring chef created a masterpiece: the \"Everything But The...\n"
     ]
    }
   ],
   "source": [
    "theme = \"over-the-top pizza toppings\"\n",
    "start = anthropic_llm.complete(\n",
    "    f\"Please start a random story around {theme}. Limit your response to 20 words.\"\n",
    ")\n",
    "print(start)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e41ddbed-d684-43a5-838e-6d0c90fa9942",
   "metadata": {},
   "outputs": [],
   "source": [
    "middle = cohere_llm.complete(\n",
    "    f\"Please continue the provided story. Limit your response to 20 words.\\n\\n {start.text}\"\n",
    ")\n",
    "climax = mistral_llm.complete(\n",
    "    f\"Please continue the attached story. Your part is the climax of the story, so make it exciting! Limit your response to 20 words.\\n\\n {start.text + middle.text}\"\n",
    ")\n",
    "ending = openai_llm.complete(\n",
    "    f\"Please continue the attached story. Your part is the end of the story, so wrap it up! Limit your response to 20 words.\\n\\n {start.text + middle.text + climax.text}\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15139945-c25b-49f3-b00b-d8d8515f1812",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a world where pizza toppings knew no bounds, one daring chef created a masterpiece: the \"Everything But The...\n",
      "\n",
      " ...Nay, Even The Kitchen Sink\" pizza. And so, the crazy culinary adventure began. \n",
      "\n",
      "Suddenly, the oven exploded, raining toppings, revealing a hidden treasure map beneath the pizza chaos!\n",
      "\n",
      "The chef, astonished, followed the map, discovering a vault of ancient recipes, forever changing the culinary world. The end.\n"
     ]
    }
   ],
   "source": [
    "# let's see our story!\n",
    "print(f\"{start}\\n\\n{middle}\\n\\n{climax}\\n\\n{ending}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4e2ea2d-9505-4a65-bb2c-1b21671e5f6e",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Five](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/5.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc1cb754-ea9a-4144-97a4-a08460b8ee9a",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Six](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/6.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "229885d5-70d3-41c7-a156-f3a0874f7576",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Seven](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/7.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a9acc43-7e6e-40e6-9bcb-cf695f718baf",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Eight](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/8.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0d32def2-75ce-46b4-aef9-adbd21a47e58",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Nine](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/9.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "59c6c7fb-868e-40e6-9143-45628f98164a",
   "metadata": {},
   "source": [
    "## Example: LLMs Lack Access To Updated Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bc83246d-7ea4-4ed6-81db-996679d1c1f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# should be able to answer this without additional context\n",
    "response = mistral_llm.complete(\n",
    "    \"What can you tell me about Georgian Partners?\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c0339eb-14d5-4d60-9c5d-4b9432f280b1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Georgian Partners is a growth equity firm that invests in business software companies. They are based in Toronto, Canada, and focus on investments in high-growth companies that use applied artificial intelligence, trust, and conversational AI to disrupt traditional markets.\n",
      "\n",
      "Georgian Partners was founded in 2008 and typically invests in Series B or later rounds. They provide capital and strategic support to help companies scale and achieve their growth objectives. Their investments range from $5 million to $50 million and they often take a minority stake in the companies they invest in.\n",
      "\n",
      "Some of the sectors Georgian Partners focuses on include security, financial technology, internet and information services, and marketing and sales software. They have invested in a number of successful companies, including Shopify, FreshBooks, and Tealium.\n",
      "\n",
      "In addition to providing capital, Georgian Partners also offers its portfolio companies access to its in-house expertise in areas such as artificial intelligence, data science, and go-to-market strategies. This value-add approach helps their portfolio companies to grow and succeed.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a1a0204-a42b-4132-8029-0cf86c0e2e90",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "I'm an AI and I don't have real-time access to databases or the internet to provide the exact percentage of customers who participated in the 2022 ESG survey from the 2022 Annual Purpose Report. You would need to check the report directly for that information. If you provide the data from the report, I can help analyze or interpret it.\n"
     ]
    }
   ],
   "source": [
    "# a query that needs Annual Report 2022\n",
    "query = \"According to the 2022 Annual Purpose Report, what percentage of customers participated in 2022 ESG survey?\"\n",
    "\n",
    "response = mistral_llm.complete(query)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48fb1cb1-b078-4004-9ec9-035a354b37ef",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Ten](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/10.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38d24db0-d19e-47ff-8282-3a03f798604f",
   "metadata": {},
   "source": [
    "## Example: RAG Yields More Accurate Responses"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8cda0969-7314-4eef-8ddd-db0ae80ae219",
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir data\n",
    "!wget \"https://cdn.pathfactory.com/assets/preprocessed/10580/b81532f1-95f3-4a1c-ba0d-80a56726e833/b81532f1-95f3-4a1c-ba0d-80a56726e833.pdf\" -O \"./data/gp-purpose-report-2022.pdf\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a773f87-64c5-40da-bb3e-3db96d4d6230",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import SimpleDirectoryReader, VectorStoreIndex\n",
    "\n",
    "# build an in-memory RAG over the Annual Report in 4 lines of code\n",
    "loader = SimpleDirectoryReader(input_dir=\"./data\")\n",
    "documents = loader.load_data()\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "rag = index.as_query_engine(llm=mistral_llm)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "207e8d84-d6d5-45b7-9f8c-b21ba8658a2e",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = rag.query(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d76260a1-ddaa-48ee-acc6-6b99a6e0d5f6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In the 2022 survey cycle, there was a participation rate of 95% across the portfolio. This participation rate is based on data self-disclosed voluntarily by 39 portfolio companies where an active board seat was held as of December 31, 2022.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "06f796b1-cd3f-41a0-81e3-5326bf409889",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Eleven](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/11.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab094120-9669-41dc-adb0-a10343e3acef",
   "metadata": {},
   "source": [
    "## Example: 3 Steps For Basic RAG (Unpacking the previous Example RAG)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c02b0a2-dfe4-495c-8810-21998d35b280",
   "metadata": {},
   "source": [
    "### Step 1: Build Knowledge Store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f108a3d4-4e2a-4e3d-a20b-1f3e122cfd78",
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"Load the data.\n",
    "\n",
    "With llama-index, before any transformations are applied,\n",
    "data is loaded in the `Document` abstraction, which is\n",
    "a container that holds the text of the document.\n",
    "\"\"\"\n",
    "\n",
    "from llama_index.core import SimpleDirectoryReader\n",
    "\n",
    "loader = SimpleDirectoryReader(input_dir=\"./data\")\n",
    "documents = loader.load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3dba1e90-4cd7-4394-a8ba-e7ed52c35729",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"2\\n PURPOSE ANNUAL REPORT | 2022Table of Contents\\nINTRODUCTION\\nSECTION 1: GEORGIAN’S PURPOSE\\nOur Purpose  ...............................................................................................................................................  5\\nPurpose Spotlight: Gradient Spaces  .............................................................................................  6\\nSECTION 2: ACCELERATING OUR PURPOSE\\nIntroducing Our Thesis on Product-led Purpose  ....................................................................  8\\nProduct-led Purpose Spotlight: Oyster  ........................................................................................  9\\nSECTION 3: COLLABORATING WITH STAKEHOLDERS\\nMeasuring What Matters: Georgian's 2022 ESG Survey  ....................................................  11\\nCybersecurity Initiative for Board Reporting ............................................................................  16\\nSECTION 4: ENHANCING OUR ESG PRACTICES\\nAdopting Global Standards  ...............................................................................................................  18\\nGeorgian’s Current Talent Pool  .......................................................................................................  19\\nSustainability at Georgian  ..................................................................................................................  21\\nLOOKING AHEAD\\n2 PURPOSE ANNUAL REPORT | 2022\""
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# if you want to see what the text looks like\n",
    "documents[1].text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a2c764c-ef6b-41ce-9ba4-c8a54ac75990",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:root:Payload indexes have no effect in the local Qdrant. Please use server Qdrant if you need payload indexes.\n"
     ]
    }
   ],
   "source": [
    "\"\"\"Chunk, Encode, and Store into a Vector Store.\n",
    "\n",
    "To streamline the process, we can make use of the IngestionPipeline\n",
    "class that will apply your specified transformations to the\n",
    "Document's.\n",
    "\"\"\"\n",
    "\n",
    "from llama_index.core.ingestion import IngestionPipeline\n",
    "from llama_index.core.node_parser import SentenceSplitter\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.vector_stores.qdrant import QdrantVectorStore\n",
    "import qdrant_client\n",
    "\n",
    "client = qdrant_client.QdrantClient(location=\":memory:\")\n",
    "vector_store = QdrantVectorStore(client=client, collection_name=\"test_store\")\n",
    "\n",
    "pipeline = IngestionPipeline(\n",
    "    transformations=[\n",
    "        SentenceSplitter(),\n",
    "        OpenAIEmbedding(),\n",
    "    ],\n",
    "    vector_store=vector_store,\n",
    ")\n",
    "_nodes = pipeline.run(documents=documents, num_workers=4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53303b32-eb64-40e5-9589-c9cb81378631",
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"Create a llama-index... wait for it... Index.\n",
    "\n",
    "After uploading your encoded documents into your vector\n",
    "store of choice, you can connect to it with a VectorStoreIndex\n",
    "which then gives you access to all of the llama-index functionality.\n",
    "\"\"\"\n",
    "\n",
    "from llama_index.core import VectorStoreIndex\n",
    "\n",
    "index = VectorStoreIndex.from_vector_store(vector_store=vector_store)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db2c0528-875a-4bf5-9372-358af9ae3d4e",
   "metadata": {},
   "source": [
    "### Step 2: Retrieve Against A Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48d6c333-06fb-4bc5-ba86-f4712d25b7fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"Retrieve relevant documents against a query.\n",
    "\n",
    "With our Index ready, we can now query it to\n",
    "retrieve the most relevant document chunks.\n",
    "\"\"\"\n",
    "\n",
    "retriever = index.as_retriever(similarity_top_k=2)\n",
    "retrieved_nodes = retriever.retrieve(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7ab5eacd-0738-4f1f-94b7-af7872c8cd3f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[NodeWithScore(node=TextNode(id_='5eb225cb-741a-4e9b-b425-a4ec362375bd', embedding=None, metadata={'page_label': '11', 'file_name': 'gp-purpose-report-2022.pdf', 'file_path': '/Users/nerdai/talks/2024/georgian-genai-bootcamp/data/gp-purpose-report-2022.pdf', 'file_type': 'application/pdf', 'file_size': 12593149, 'creation_date': '2024-06-19', 'last_modified_date': '2023-09-06'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='881f1875-da9e-489e-ba53-9fcc15b4cd50', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '11', 'file_name': 'gp-purpose-report-2022.pdf', 'file_path': '/Users/nerdai/talks/2024/georgian-genai-bootcamp/data/gp-purpose-report-2022.pdf', 'file_type': 'application/pdf', 'file_size': 12593149, 'creation_date': '2024-06-19', 'last_modified_date': '2023-09-06'}, hash='dc8cc0c031db311dc0817207a461c66c70e194af374bfe04f616acc9e5fe2499')}, text=\"11 PURPOSE ANNUAL REPORT | 2022Measuring What Matters: Georgian's 2022 ESG Survey\\nIn 2020, we launched our inaugural environmental, social and governance (ESG) survey.  \\nThis voluntary questionnaire collects data on material ESG topics from our customers that  \\nchoose to participate. 
\\nGeorgian’s ESG data collection is based on several global and industry frameworks, ensuring  \\nthat our tracking and reporting are up-to-date with evolving standards and regulations.\\nKEY STANDARDS THE ESG SURVEY IS BASED ON\\nWE TRACK 80+ UNIQUE METRICS\\n• Board composition and \\nboard diversity\\n• ESG oversight\\n• Governance and \\ncompliance• Cybersecurity, privacy\\n• Employee well-being and \\nemployee engagement\\n• Diversity, inclusion, \\nbelonging and equity• Employee turnover\\n• Labor conditions\\n• GHG emissions and \\nsustainability initiatives\\nOver the past three surveys, survey participation has \\nincreased. For the 2022 survey cycle,  \\nwe had 95% participation across our portfolio1.\\n1 Georgian’s 2022 ESG survey is based on data that is self-disclosed, voluntarily, by 39 Georgian portfolio companies where we held an active board seat as of December 31, 2022. While \\nwe strive to ensure the accuracy and reliability of the information presented, it is important to note that the data is based on self-reporting by the respective companies.  \\nAs such, we cannot guarantee the completeness, veracity, or timeliness of the disclosed information.ESG Data  \\nConvergence  \\nInitiativeSustainability \\nAccounting  \\nStandards BoardGlobal Reporting  \\nInitiativeTask Force on   \\nClimate-related \\nFinancial Disclosures\\n2022 survey participation \\nacross our portfolio 95%\", start_char_idx=0, end_char_idx=1644, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.8778799636598817),\n",
       " NodeWithScore(node=TextNode(id_='0c9db3cf-6bf7-4b9c-b9da-66d3b40baf39', embedding=None, metadata={'page_label': '14', 'file_name': 'gp-purpose-report-2022.pdf', 'file_path': '/Users/nerdai/talks/2024/georgian-genai-bootcamp/data/gp-purpose-report-2022.pdf', 'file_type': 'application/pdf', 'file_size': 12593149, 'creation_date': '2024-06-19', 'last_modified_date': '2023-09-06'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='79cfb29a-d270-4e16-b979-76921040d2dd', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '14', 'file_name': 'gp-purpose-report-2022.pdf', 'file_path': '/Users/nerdai/talks/2024/georgian-genai-bootcamp/data/gp-purpose-report-2022.pdf', 'file_type': 'application/pdf', 'file_size': 12593149, 'creation_date': '2024-06-19', 'last_modified_date': '2023-09-06'}, hash='42b95df9955077dfd1946ff1e6b7bef5060300e4a5d80d83ed3a8cc7af53687a')}, text='14\\n PURPOSE ANNUAL REPORT | 2022In 2022, we identified several areas where improvements could be made in an effort to enhance ESG performance.\\nWorkplace Belonging Initiatives\\nResearch1 indicates that employee belonging is linked to retention and productivity. In 2022, some Georgian \\ncustomers already reported offering programs that create a sense of belonging, while other customers reported \\noffering elements of a similar program. Volunteering was part of fostering belonging for many customers, with 61% \\nof participating Georgian customers reporting paid time off (PTO) for employee volunteering. Meanwhile, in 2022, \\n25% of participating Georgian customers reported offering employee mentorship. 
Approximately 50% of Georgian \\nparticipating customers report implementing formal initiatives to support employee resource groups and/or \\ncorporate social responsibility. \\n44%\\n25%61%\\n47%60%\\n40%\\n20%\\n0%\\nEmployee \\nResource GroupsMentorship \\nProgramPTO for Employee \\nVolunteeringFormal Volunteer and \\nDonation Program\\nBoard ESG Oversight\\nIn our view, board-level ESG reporting is a growing best \\npractice that supports oversight of risk and opportunities \\nfor value creation. According to the National Association \\nof Corporate Directors2, 39% of private and 62% of public \\ncompany boards review ESG performance. In 2022, 13% \\nof participating Georgian customers indicated that they \\nprovide board-level ESG reporting, with another 22% of \\nparticipating Georgian portfolio companies reporting that \\nthey plan to implement updates in the next 12 months. As \\ncustomers develop their board reporting capabilities while \\nthey scale, Georgian seeks to support these companies as \\nthey formalize their processes. See page 16  for an example \\nof our work with The Collective, where we facilitated \\na working group of customers to advance their cyber \\nreporting processes to boards. 40%\\n30%\\n25%\\n20%\\n15%\\n10%\\n5%\\n0%13%22% 39%\\nProvides board-level \\nESG reporting\\nPlans to implement vs. \\nwill implement board-\\nlevel ESG reporting in \\n12 months\\nIndustry benchmark\\n1 The Value of Belonging at Work , HBR, 2019\\n2 “Private Company Board Practices & Oversight Survey ” NACD, 2022. Comparable benchmark to Georgian companies, respondents are directors \\nfrom private company boards, primarily in North America, with half of the companies generating less than US$250 million in revenue .', start_char_idx=0, end_char_idx=2365, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n'), score=0.8566923801656308)]"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# to view the retrieved nodes\n",
    "retrieved_nodes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "682d1e8c-a579-4449-b60d-8f2bd98ad1ee",
   "metadata": {},
   "source": [
    "### Step 3: Generate Final Response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "08e65509-ab2c-49bd-9dc1-af38e389b40c",
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"Context-Augmented Generation.\n",
    "\n",
    "With our Index ready, we can create a QueryEngine\n",
    "that handles the retrieval and context augmentation\n",
    "in order to get the final response.\n",
    "\"\"\"\n",
    "\n",
    "query_engine = index.as_query_engine(llm=mistral_llm)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "31d94f08-db5b-452b-bb24-cf087d6c5307",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context information is below.\n",
      "---------------------\n",
      "{context_str}\n",
      "---------------------\n",
      "Given the context information and not prior knowledge, answer the query.\n",
      "Query: {query_str}\n",
      "Answer: \n"
     ]
    }
   ],
   "source": [
    "# to inspect the default prompt being used\n",
    "print(\n",
    "    query_engine.get_prompts()[\n",
    "        \"response_synthesizer:text_qa_template\"\n",
    "    ].default_template.template\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1439a11f-cba5-4b80-8c4c-edbd43026b93",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In the 2022 survey cycle, there was a participation rate of 95% across the portfolio. This participation rate is based on data self-disclosed voluntarily by 39 portfolio companies where an active board seat was held as of December 31, 2022.\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(query)\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "06113973-f7f5-42fe-96bf-2a14be1b6c84",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Twelve](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/12.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "821424ae-8567-4919-95e1-385c4cd07052",
   "metadata": {},
   "source": [
    "[Hi-Resolution Cheat Sheet](https://d3ddy8balm3goa.cloudfront.net/llamaindex/rag-cheat-sheet-final.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a990db36-125c-461a-a4c8-bb50c0cf5f7f",
   "metadata": {},
   "source": [
    "## Example: Graph RAG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3f71807-8fc4-4d29-b603-4db763daa741",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/nerdai/.pyenv/versions/3.11.3/envs/georgian-genai-bootcamp/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n",
      "Parsing nodes: 100%|█████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 145.41it/s]\n",
      "Extracting paths from text: 100%|█████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.37it/s]\n",
      "Extracting implicit paths: 100%|███████████████████████████████████████████████████████| 10/10 [00:00<00:00, 39494.39it/s]\n",
      "Generating embeddings: 100%|████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.44it/s]\n",
      "Generating embeddings: 100%|████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  3.14it/s]\n"
     ]
    }
   ],
   "source": [
    "from llama_index.core import PropertyGraphIndex\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "index = PropertyGraphIndex.from_documents(\n",
    "    documents[10:20],\n",
    "    llm=openai_llm,\n",
    "    embed_model=OpenAIEmbedding(model_name=\"text-embedding-ada-002\"),\n",
    "    show_progress=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04c8105a-096f-4b6d-8a7b-47c3703a2f9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "index.property_graph_store.save_networkx_graph(name=\"./kg.html\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd5b7f9f-de05-4b86-9b03-3e37ce4c2c70",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Georgian -> Launched -> Inaugural esg survey\n",
      "2022 survey cycle -> Had -> 95% participation\n",
      "Survey participation -> Increased -> Past three surveys\n",
      "Higher purpose annual report -> Is -> 17 purpose annual report\n"
     ]
    }
   ],
   "source": [
    "retriever = index.as_retriever(\n",
    "    include_text=False,  # include source text, default True\n",
    ")\n",
    "\n",
    "nodes = retriever.retrieve(query)\n",
    "\n",
    "for node in nodes:\n",
    "    print(node.text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "606dce46-5e85-4ef0-bd1d-dddd16789e4e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "According to the 2022 Annual Purpose Report, 95% of customers participated in the 2022 ESG survey.\n"
     ]
    }
   ],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    include_text=True,\n",
    ")\n",
    "\n",
    "response = query_engine.query(query)\n",
    "\n",
    "print(str(response))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3bcd491-b438-48e8-ade3-f5219fe1c308",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Thirteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/13.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f024428-379c-4c0d-bbdd-5909e569a5b2",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Fourteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/14.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "716216ec-3d15-405b-ac47-67d454618516",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Fifteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/15.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "24fdb303-d471-42c4-8eb6-2e82dc631fc5",
   "metadata": {},
   "source": [
    "## Example: Agent Ingredients — Tool Use\n",
    "\n",
    "**Note:** LLMs are not very good pseudo-random number generators (see my [LinkedIn post](https://www.linkedin.com/posts/nerdai_heres-s-fun-mini-experiment-the-activity-7193715824493219841-6AWt?utm_source=share&utm_medium=member_desktop) about this)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9f9ac3f5-610a-4006-ad8c-41269230e680",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import FunctionTool\n",
    "from llama_index.agent.openai import OpenAIAgent\n",
    "from numpy import random\n",
    "from typing import List"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9344061a-1609-4b8d-ba05-3aa115f2ad0a",
   "metadata": {},
   "outputs": [],
   "source": [
    "def uniform_random_sample(n: int) -> List[float]:\n",
    "    \"\"\"Generate a list a of uniform random numbers of size n between 0 and 1.\"\"\"\n",
    "    return random.rand(n).tolist()\n",
    "\n",
    "\n",
    "rs_tool = FunctionTool.from_defaults(fn=uniform_random_sample)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8ec77eb9-270f-4c4e-badd-ee8378f1d72b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: Can you please give me a sample of 10 uniformly random numbers?\n",
      "=== Calling Function ===\n",
      "Calling function: uniform_random_sample with args: {\"n\":10}\n",
      "Got output: [0.40940474059026144, 0.8398091516648358, 0.09192236283686928, 0.692139392648083, 0.7389192707142035, 0.22912923301893906, 0.8523996740969749, 0.4647939544498805, 0.07436044055501878, 0.0008621837461605386]\n",
      "========================\n",
      "\n",
      "Here is a sample of 10 uniformly random numbers:\n",
      "\n",
      "1. 0.4094\n",
      "2. 0.8398\n",
      "3. 0.0919\n",
      "4. 0.6921\n",
      "5. 0.7389\n",
      "6. 0.2291\n",
      "7. 0.8524\n",
      "8. 0.4648\n",
      "9. 0.0744\n",
      "10. 0.0009\n"
     ]
    }
   ],
   "source": [
    "agent = OpenAIAgent.from_tools([rs_tool], llm=openai_llm, verbose=True)\n",
    "\n",
    "response = agent.chat(\n",
    "    \"Can you please give me a sample of 10 uniformly random numbers?\"\n",
    ")\n",
    "print(str(response))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca6f2caa-eb71-4248-997c-7cf6906ff03f",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Sixteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/16.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "122955d6-9ea7-455e-853c-977005f0f003",
   "metadata": {},
   "source": [
    "## Example: Agent Ingredients — Composable Memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0e607609-93b2-499d-bfee-62e9dcae7db8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.memory import (\n",
    "    VectorMemory,\n",
    "    SimpleComposableMemory,\n",
    "    ChatMemoryBuffer,\n",
    ")\n",
    "from llama_index.core.agent import FunctionCallingAgentWorker"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "61538923-8796-482f-b7a8-8b0768c7d762",
   "metadata": {},
   "outputs": [],
   "source": [
    "vector_memory = VectorMemory.from_defaults(\n",
    "    vector_store=None,  # leave as None to use default in-memory vector store\n",
    "    embed_model=OpenAIEmbedding(),\n",
    "    retriever_kwargs={\"similarity_top_k\": 2},\n",
    ")\n",
    "\n",
    "chat_memory_buffer = ChatMemoryBuffer.from_defaults()\n",
    "\n",
    "composable_memory = SimpleComposableMemory.from_defaults(\n",
    "    primary_memory=chat_memory_buffer,\n",
    "    secondary_memory_sources=[vector_memory],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70bbf4c6-9ef4-4eb8-be72-08d52249142f",
   "metadata": {},
   "outputs": [],
   "source": [
    "def multiply(a: int, b: int) -> int:\n",
    "    \"\"\"Multiply two integers and returns the result integer.\"\"\"\n",
    "    return a * b\n",
    "\n",
    "\n",
    "def mystery(a: int, b: int) -> int:\n",
    "    \"\"\"Mystery function on two numbers.\"\"\"\n",
    "    return a**2 - b**2\n",
    "\n",
    "\n",
    "multiply_tool = FunctionTool.from_defaults(fn=multiply)\n",
    "mystery_tool = FunctionTool.from_defaults(fn=mystery)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3f8d9a6b-b67e-4f71-b92e-b8f0f968ad47",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent_worker = FunctionCallingAgentWorker.from_tools(\n",
    "    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True\n",
    ")\n",
    "agent = agent_worker.as_agent(memory=composable_memory)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d32e9a73-0d6d-451f-9796-f99e7dbb26be",
   "metadata": {},
   "source": [
    "### Execute some function calls"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34b83e55-6f48-425a-b5c9-5349fe89b160",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What is the mystery function on 5 and 6?\n",
      "=== Calling Function ===\n",
      "Calling function: mystery with args: {\"a\": 5, \"b\": 6}\n",
      "=== Function Output ===\n",
      "-11\n",
      "=== LLM Response ===\n",
      "The result of the mystery function on 5 and 6 is -11.\n"
     ]
    }
   ],
   "source": [
    "response = agent.chat(\"What is the mystery function on 5 and 6?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "00f3be76-3b1b-41dc-abc0-9c6213445a0f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What happens if you multiply 2 and 3?\n",
      "=== Calling Function ===\n",
      "Calling function: multiply with args: {\"a\": 2, \"b\": 3}\n",
      "=== Function Output ===\n",
      "6\n",
      "=== LLM Response ===\n",
      "Multiplying 2 and 3 gives you 6.\n"
     ]
    }
   ],
   "source": [
    "response = agent.chat(\"What happens if you multiply 2 and 3?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b2e22f8-f59c-4449-b39a-dcba8049a609",
   "metadata": {},
   "source": [
    "### New Agent Session"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74cff862-ad5a-40fd-a199-99b2a61fbcf3",
   "metadata": {},
   "source": [
    "#### Without memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d029d3b-97a5-44db-9355-d937143b40bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent_worker = FunctionCallingAgentWorker.from_tools(\n",
    "    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True\n",
    ")\n",
    "agent_without_memory = agent_worker.as_agent()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8eabb72f-3681-42e3-ad1c-55c02a2bc950",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.\n",
      "=== LLM Response ===\n",
      "I don't have the ability to recall past interactions or outputs. However, I can recompute the result of the mystery function on 5 and 6 for you. Would you like me to do that?\n"
     ]
    }
   ],
   "source": [
    "response = agent_without_memory.chat(\n",
    "    \"What was the output of the mystery function on 5 and 6 again? Don't recompute.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e40632a-47c2-46c4-8df8-818f7eb36aef",
   "metadata": {},
   "source": [
    "#### With memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6843bf31-8930-40aa-9df2-6ed2434b71f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(model=\"gpt-3.5-turbo-0613\")\n",
    "agent_worker = FunctionCallingAgentWorker.from_tools(\n",
    "    [multiply_tool, mystery_tool], llm=openai_llm, verbose=True\n",
    ")\n",
    "composable_memory = SimpleComposableMemory.from_defaults(\n",
    "    primary_memory=ChatMemoryBuffer.from_defaults(),\n",
    "    secondary_memory_sources=[\n",
    "        vector_memory.copy(\n",
    "            deep=True\n",
    "        )  # using a copy here for illustration purposes\n",
    "        # later will use original vector_memory again\n",
    "    ],\n",
    ")\n",
    "agent_with_memory = agent_worker.as_agent(memory=composable_memory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c231bbd-c629-4376-a97b-104ef867536e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent_with_memory.chat_history  # an empty chat history"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33e98b79-fb92-4b9c-872f-3d94d589f883",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.\n",
      "=== LLM Response ===\n",
      "The output of the mystery function on 5 and 6 was -11.\n"
     ]
    }
   ],
   "source": [
    "response = agent_with_memory.chat(\n",
    "    \"What was the output of the mystery function on 5 and 6 again? Don't recompute.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "001c8637-3561-4dd0-a128-9573986d6a3c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.\n",
      "=== LLM Response ===\n",
      "The output of the multiply function on 2 and 3 was 6.\n"
     ]
    }
   ],
   "source": [
    "response = agent_with_memory.chat(\n",
    "    \"What was the output of the multiply function on 2 and 3 again? Don't recompute.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27591b78-bc00-48b6-9e8a-8ce691fa9cb1",
   "metadata": {},
   "source": [
    "#### Under the hood\n",
    "\n",
    "Calling `.chat()` will invoke `memory.get()`. For `SimpleComposableMemory` memory retrieved from secondary sources get added to the system prompt of the main memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3951a2e-8e6c-4cc9-a328-41cbfe013586",
   "metadata": {},
   "outputs": [],
   "source": [
    "composable_memory = SimpleComposableMemory.from_defaults(\n",
    "    primary_memory=ChatMemoryBuffer.from_defaults(),\n",
    "    secondary_memory_sources=[\n",
    "        vector_memory.copy(\n",
    "            deep=True\n",
    "        )  # copy for illustrative purposes to explain what\n",
    "        # happened under the hood from previous subsection\n",
    "    ],\n",
    ")\n",
    "agent_with_memory = agent_worker.as_agent(memory=composable_memory)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0072e2b8-8c46-4258-828f-24ce9ceadd65",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "system: You are a helpful assistant.\n",
      "\n",
      "Below are a set of relevant dialogues retrieved from potentially several memory sources:\n",
      "\n",
      "=====Relevant messages from memory source 1=====\n",
      "\n",
      "\tUSER: What is the mystery function on 5 and 6?\n",
      "\tASSISTANT: None\n",
      "\tTOOL: -11\n",
      "\tASSISTANT: The result of the mystery function on 5 and 6 is -11.\n",
      "\n",
      "=====End of relevant messages from memory source 1======\n",
      "\n",
      "This is the end of the retrieved message dialogues.\n"
     ]
    }
   ],
   "source": [
    "print(\n",
    "    agent_with_memory.memory.get(\n",
    "        \"What was the output of the mystery function on 5 and 6 again? Don't recompute.\"\n",
    "    )[0]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58ec1f66-4e81-4092-adef-e0c25a52feec",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Seventeen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/17.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "134354bf-6901-481a-87da-d4a41cb6cd79",
   "metadata": {},
   "source": [
    "## Example: Reflection Toxicity Reduction\n",
    "\n",
    "Here, we'll use llama-index `TollInteractiveReflectionAgent` to perform reflection and correction cycles on potentially harmful text. See the full demo [here](https://github.com/run-llama/llama_index/blob/main/docs/examples/agent/introspective_agent_toxicity_reduction.ipynb).\n",
    "\n",
    "The first thing we will do here is define the `PerspectiveTool`, which our `ToolInteractiveReflectionAgent` will make use of thru another agent, namely a `CritiqueAgent`.\n",
    "\n",
    "To use Perspecive's API, you will need to do the following steps:\n",
    "\n",
    "1. Enable the Perspective API in your Google Cloud projects\n",
    "2. Generate a new set of credentials (i.e. API key) that you will need to either set an env var `PERSPECTIVE_API_KEY` or supply directly in the appropriate parts of the code that follows.\n",
    "\n",
    "To perform steps 1. and 2., you can follow the instructions outlined here: https://developers.perspectiveapi.com/s/docs-enable-the-api?language=en_US."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64e8c235-4fef-450b-9e60-a879455da6af",
   "metadata": {},
   "source": [
    "### Perspective API as Tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4fcd3dd0-226c-4f34-b8cd-eab20e4a9363",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.bridge.pydantic import Field\n",
    "\n",
    "from googleapiclient import discovery\n",
    "from typing import Dict, Optional, Tuple\n",
    "import os"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9bd14695-5cff-4000-a104-981275c854f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "class Perspective:\n",
    "    \"\"\"Custom class to interact with Perspective API.\"\"\"\n",
    "\n",
    "    attributes = [\n",
    "        \"toxicity\",\n",
    "        \"severe_toxicity\",\n",
    "        \"identity_attack\",\n",
    "        \"insult\",\n",
    "        \"profanity\",\n",
    "        \"threat\",\n",
    "        \"sexually_explicit\",\n",
    "    ]\n",
    "\n",
    "    def __init__(self, api_key: Optional[str] = None) -> None:\n",
    "        if api_key is None:\n",
    "            try:\n",
    "                api_key = os.environ[\"PERSPECTIVE_API_KEY\"]\n",
    "            except KeyError:\n",
    "                raise ValueError(\n",
    "                    \"Please provide an api key or set PERSPECTIVE_API_KEY env var.\"\n",
    "                )\n",
    "\n",
    "        self._client = discovery.build(\n",
    "            \"commentanalyzer\",\n",
    "            \"v1alpha1\",\n",
    "            developerKey=api_key,\n",
    "            discoveryServiceUrl=\"https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1\",\n",
    "            static_discovery=False,\n",
    "        )\n",
    "\n",
    "    def get_toxicity_scores(self, text: str) -> Dict[str, float]:\n",
    "        \"\"\"Function that makes API call to Perspective to get toxicity scores across various attributes.\"\"\"\n",
    "        analyze_request = {\n",
    "            \"comment\": {\"text\": text},\n",
    "            \"requestedAttributes\": {\n",
    "                att.upper(): {} for att in self.attributes\n",
    "            },\n",
    "        }\n",
    "\n",
    "        response = (\n",
    "            self._client.comments().analyze(body=analyze_request).execute()\n",
    "        )\n",
    "        try:\n",
    "            return {\n",
    "                att: response[\"attributeScores\"][att.upper()][\"summaryScore\"][\n",
    "                    \"value\"\n",
    "                ]\n",
    "                for att in self.attributes\n",
    "            }\n",
    "        except Exception as e:\n",
    "            raise ValueError(\"Unable to parse response\") from e\n",
    "\n",
    "\n",
    "perspective = Perspective()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ac230f97-3562-4df3-b611-f529085c7287",
   "metadata": {},
   "outputs": [],
   "source": [
    "def perspective_function_tool(\n",
    "    text: str = Field(\n",
    "        default_factory=str,\n",
    "        description=\"The text to compute toxicity scores on.\",\n",
    "    ),\n",
    ") -> Tuple[str, float]:\n",
    "    \"\"\"Returns the toxicity score of the most problematic toxic attribute.\"\"\"\n",
    "    scores = perspective.get_toxicity_scores(text=text)\n",
    "    max_key = max(scores, key=scores.get)\n",
    "    return (max_key, scores[max_key] * 100)\n",
    "\n",
    "\n",
    "from llama_index.core.tools import FunctionTool\n",
    "\n",
    "pespective_tool = FunctionTool.from_defaults(\n",
    "    perspective_function_tool,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "121b73e7-d45c-4fd4-9f1a-33b5c055fd6a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('toxicity', 2.6028076)"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "perspective_function_tool(text=\"friendly greetings from python\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad35861b-a162-4590-a2ba-53f917d53b78",
   "metadata": {},
   "source": [
    "### Build Agent To Reduce Toxicity of Harmful Text"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b510f01c-6619-417b-b040-ac2a7e744431",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.agent.introspective import IntrospectiveAgentWorker\n",
    "from llama_index.agent.introspective import (\n",
    "    ToolInteractiveReflectionAgentWorker,\n",
    ")\n",
    "from llama_index.core.llms import ChatMessage, MessageRole"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8518fc2c-36c2-4144-b3ad-3c206be99c28",
   "metadata": {},
   "outputs": [],
   "source": [
    "verbose = True\n",
    "critique_agent_worker = FunctionCallingAgentWorker.from_tools(\n",
    "    tools=[pespective_tool], llm=OpenAI(\"gpt-3.5-turbo\"), verbose=verbose\n",
    ")\n",
    "correction_llm = OpenAI(\"gpt-4-turbo-preview\")\n",
    "\n",
    "\n",
    "def stopping_callable(critique_str: str) -> bool:\n",
    "    \"\"\"\n",
    "    Function that determines stopping condition for reflection & correction cycles.\n",
    "\n",
    "    critique_str [str]: The response string provided by the critique agent.\n",
    "    \"\"\"\n",
    "    return \"[PASS]\" in critique_str\n",
    "\n",
    "\n",
    "tool_interactive_reflection_agent_worker = (\n",
    "    ToolInteractiveReflectionAgentWorker.from_defaults(\n",
    "        critique_agent_worker=critique_agent_worker,\n",
    "        critique_template=(\n",
    "            \"Please provide a toxicity critique for the provided text. \"\n",
    "            \"If the toxicity score is less than 3.0 then please \"\n",
    "            \"write '[PASS]' otherwise write '[FAIL]'. \"\n",
    "            \"Here is the text:\\n {input_str}\"\n",
    "        ),\n",
    "        stopping_callable=stopping_callable,\n",
    "        correction_llm=correction_llm,\n",
    "        verbose=verbose,\n",
    "    )\n",
    ")\n",
    "\n",
    "introspective_agent_worker = IntrospectiveAgentWorker.from_defaults(\n",
    "    reflective_agent_worker=tool_interactive_reflection_agent_worker,\n",
    "    verbose=verbose,\n",
    ")\n",
    "\n",
    "chat_history = [\n",
    "    ChatMessage(\n",
    "        content=\"You are an assistant that generates safer versions of potentially toxic, user-supplied text.\",\n",
    "        role=MessageRole.SYSTEM,\n",
    "    )\n",
    "]\n",
    "\n",
    "introspective_agent = introspective_agent_worker.as_agent(\n",
    "    chat_history=chat_history, verbose=verbose\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2cf0716-9188-43c6-b7e2-9ef8c95db84b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "> Running step 0dfa03af-4a83-44ff-80b3-e36f5d78a0ab. Step input: Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her\n",
      "Added user message to memory: Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her\n",
      "> Running step 487919de-e059-4ba9-bcc0-3fd8d0c4d9be. Step input: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:\n",
      " Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her\n",
      "Added user message to memory: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:\n",
      " Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I’ve ever met. One vegan acquaintance routinely beat her\n",
      "=== Calling Function ===\n",
      "Calling function: perspective_function_tool with args: {\"text\": \"Those individuals who publicly tout eating no meat, especially when their stated reason has to do with caring about animals, are some of the most hypocritical people I\\u2019ve ever met. One vegan acquaintance routinely beat her\"}\n",
      "=== Function Output ===\n",
      "('insult', 51.01222)\n",
      "> Running step 77272f0c-452f-49c6-a641-ff0c013327a1. Step input: None\n",
      "=== LLM Response ===\n",
      "[FAIL] The toxicity score of the most problematic toxic attribute is 51.01, which is higher than 3.0.\n",
      "Critique: [FAIL] The toxicity score of the most problematic toxic attribute is 51.01, which is higher than 3.0.\n",
      "Correction: Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual reasons, which may include animal welfare.\n",
      "> Running step 4823ce57-f98c-4835-8cc7-ea21cbdc2066. Step input: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:\n",
      " Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual reasons, which may include animal welfare.\n",
      "Added user message to memory: Please provide a toxicity critique for the provided text. If the toxicity score is less than 3.0 then please write '[PASS]' otherwise write '[FAIL]'. Here is the text:\n",
      " Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual reasons, which may include animal welfare.\n",
      "=== Calling Function ===\n",
      "Calling function: perspective_function_tool with args: {\"text\": \"Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual reasons, which may include animal welfare.\"}\n",
      "=== Function Output ===\n",
      "('toxicity', 1.2126249)\n",
      "> Running step f246561c-3de1-4d67-b23e-966bc77725e7. Step input: None\n",
      "=== LLM Response ===\n",
      "[PASS] The toxicity score of the most problematic toxic attribute is 1.21, which is less than 3.0. The text passes the toxicity critique.\n",
      "Critique: [PASS] The toxicity score of the most problematic toxic attribute is 1.21, which is less than 3.0. The text passes the toxicity critique.\n"
     ]
    }
   ],
   "source": [
    "harmful_text = (\n",
    "    \"Those individuals who publicly tout eating no meat, \"\n",
    "    \"especially when their stated reason has to do with caring \"\n",
    "    \"about animals, are some of the most hypocritical people I’ve \"\n",
    "    \"ever met. One vegan acquaintance routinely beat her\"\n",
    ")\n",
    "response = introspective_agent.chat(harmful_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "76fb45f1-d384-4ff7-b50d-c7d985f785d9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Discussing dietary choices, especially veganism, often brings up strong opinions. It's important to approach these conversations with understanding and respect for individual reasons, which may include animal welfare.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "892234c8-4567-4606-bdab-fe4878b89cce",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Eighteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/18.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "061cd47f-5f40-467c-8e88-edfa1474333f",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Nineteen](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/19.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2b1d75a-3aa1-460d-af4a-8bf79215bb2b",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide Twenty](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/20.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d94101d-e392-4268-afdb-7a5c047c6745",
   "metadata": {},
   "source": [
    "## Example: Agentic RAG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "55954c5c-e17a-4b58-91ab-5a4f9455971c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import ToolMetadata\n",
    "from llama_index.core.tools import QueryEngineTool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5499728a-66b3-4ea2-9bc5-026a0a5bd505",
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir vector_data\n",
    "!wget \"https://vectorinstitute.ai/wp-content/uploads/2024/02/Vector-Annual-Report-2022-23_accessible_rev0224-1.pdf\" -O \"./vector_data/Vector-Annual-Report-2022-23_accessible_rev0224-1.pdf\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "87a674de-6a14-4edc-9f58-13539488fa63",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Build a basic RAG query engine over the Vector Institute annual report\n",
    "vector_loader = SimpleDirectoryReader(input_dir=\"./vector_data\")\n",
    "vector_documents = vector_loader.load_data()\n",
    "vector_index = VectorStoreIndex.from_documents(vector_documents)\n",
    "vector_query_engine = vector_index.as_query_engine(llm=mistral_llm)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b94c0f5-c69f-4b9f-b4ff-e7ec3aa20811",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine_tools = [\n",
    "    QueryEngineTool(\n",
    "        query_engine=query_engine,\n",
    "        metadata=ToolMetadata(\n",
    "            name=\"georgian_partners_annual_purpose_report_2022\",\n",
    "            description=(\n",
    "                \"Provides information on purpose initiatives for Georgian Partners in the year 2022.\"\n",
    "            ),\n",
    "        ),\n",
    "    ),\n",
    "    QueryEngineTool(\n",
    "        query_engine=vector_query_engine,\n",
    "        metadata=ToolMetadata(\n",
    "            name=\"vector_annual_report_2023\",\n",
    "            description=(\n",
    "                \"Provides information about Vector in the year 2023.\"\n",
    "            ),\n",
    "        ),\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b99b898b-28a6-42d2-adf5-999cc5d47e8f",
   "metadata": {},
   "outputs": [],
   "source": [
    "agent = OpenAIAgent.from_tools(query_engine_tools, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b6591d02-5201-4a1c-8f2b-f7d580619987",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: According to the 2022 Annual Purpose Report, what percentage of customers participated in 2022 ESG survey?\n",
      "=== Calling Function ===\n",
      "Calling function: georgian_partners_annual_purpose_report_2022 with args: {\"input\":\"percentage of customers participated in 2022 ESG survey\"}\n",
      "Got output: 95%\n",
      "========================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "response = agent.chat(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "38e20c0c-1752-4443-a6ce-982287ba79ab",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "95% of customers participated in the 2022 ESG survey according to the 2022 Annual Purpose Report.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8535c52-9b70-4873-81dd-83e97d2b28c1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: According to Vector Institute's Annual Report 2022-2023, how many AI jobs were created in Ontario?\n",
      "=== Calling Function ===\n",
      "Calling function: vector_annual_report_2023 with args: {\"input\":\"number of AI jobs created in Ontario\"}\n",
      "Got output: The number of AI jobs created in Ontario is 20,634.\n",
      "========================\n",
      "\n"
     ]
    }
   ],
   "source": [
    "response = agent.chat(\n",
    "    \"According to Vector Institute's Annual Report 2022-2023, \"\n",
    "    \"how many AI jobs were created in Ontario?\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "af14c4da-aa9d-46e4-86c2-65d4d936d19c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "According to Vector Institute's Annual Report 2022-2023, 20,634 AI jobs were created in Ontario.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "db5aef92-f7c4-450c-80df-bfdcd68b6092",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide TwentyOne](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/21.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64d852aa-230c-4f2e-80f4-a9c77a300ebd",
   "metadata": {},
   "source": [
    "## Example: Multi-hop Agent (WIP)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee0d0e06-38d7-46a9-8a18-3a92464474c1",
   "metadata": {},
   "source": [
    "At the time of this presentation, this is still ongoing work. Despite its unfinished status, it demonstrates the flexibility and advantages of using an agentic interface over external knowledge bases (i.e., RAG).\n",
    "\n",
    "With the multi-hop agent, we aim to answer queries by first planning which data elements must be retrieved before the question can be answered. In doing so, we combine a few concepts:\n",
    "\n",
    "- planning\n",
    "- structured data extraction (using a RAG tool)\n",
    "- reflection/correction\n",
    "\n",
    "![multi-hop agent](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/multi-hop-agent.excalidraw.svg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d446e3f-9618-47c3-a057-08c88e694866",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Document\n",
    "from llama_index.core.tools import QueryEngineTool\n",
    "\n",
    "index = VectorStoreIndex.from_documents([Document.example()])\n",
    "tool = QueryEngineTool.from_defaults(\n",
    "    index.as_query_engine(),\n",
    "    name=\"dummy\",\n",
    "    description=\"dummy\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e078006-d2c9-490e-9881-e5cdd23a9a81",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.agent import FunctionCallingAgentWorker\n",
    "from llama_index.agent.multi_hop.planner import MultiHopPlannerAgent\n",
    "\n",
    "# create the function calling worker for reasoning\n",
    "worker = FunctionCallingAgentWorker.from_tools([tool], verbose=True)\n",
    "\n",
    "# wrap the worker in the top-level planner\n",
    "agent = MultiHopPlannerAgent(worker, tools=[tool], verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc4412ca-69d9-4324-b734-11d89adb198b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== Initial plan ===\n",
      "## Structured Context\n",
      "{\n",
      "    \"title\": \"StructuredContext\",\n",
      "    \"description\": \"Data class for holding data requirements to answer query\",\n",
      "    \"type\": \"object\",\n",
      "    \"properties\": {\n",
      "        \"film_director_comparison\": {\n",
      "            \"title\": \"Film Director Comparison\",\n",
      "            \"description\": \"Compare Gene Kelly and Yannis Smaragdis to determine who is more than just a film director.\",\n",
      "            \"type\": \"string\"\n",
      "        }\n",
      "    }\n",
      "}\n",
      "\n",
      "## Sub Tasks\n",
      "film_director_comparison:\n",
      "Extract this data field. -> Compare Gene Kelly and Yannis Smaragdis to determine who is more than just a film director.\n",
      "deps: []\n",
      "\n",
      "\n",
      "merge_data_extractions:\n",
      "Use the provided data to fill in the StructuredContext data class. -> A StructuredContext object.\n",
      "deps: ['film_director_comparison']\n",
      "\n",
      "\n",
      "query_response_tasks:\n",
      "Who is more than just a film director, Gene Kelly or Yannis Smaragdis? -> Response to the query.\n",
      "deps: ['film_director_comparison', 'merge_data_extractions']\n",
      "\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "'f04e6fc3-ab59-465a-8b48-9a500b69166a'"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent.create_plan(\n",
    "    input=\"Who is more than just a film director, Gene Kelly or Yannis Smaragdis?\"\n",
    ")"
   ]
  },
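  {
   "cell_type": "markdown",
   "id": "multi-hop-plan-order-note",
   "metadata": {},
   "source": [
    "The `deps` lists in the plan printed above imply an execution order: a sub-task can only run once everything it depends on has finished. As a minimal sketch of that idea (not the planner's actual implementation, which is not shown in this notebook), the next cell copies the three sub-task names and their `deps` from the printout into a plain dictionary, here given the hypothetical name `plan_deps`, and uses the standard-library `graphlib.TopologicalSorter` to derive a valid order.\n",
    "\n",
    "In this plan each sub-task depends on all earlier ones, so only one order is valid. A plan with several independent data fields would leave the sorter free to schedule those extractions in any order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "multi-hop-plan-order-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "from graphlib import TopologicalSorter  # standard library, Python 3.9+\n",
    "\n",
    "# Hypothetical illustration: sub-task names and deps copied by hand from\n",
    "# the \"Sub Tasks\" printout above. Each key maps to the tasks it depends on.\n",
    "plan_deps = {\n",
    "    \"film_director_comparison\": [],\n",
    "    \"merge_data_extractions\": [\"film_director_comparison\"],\n",
    "    \"query_response_tasks\": [\n",
    "        \"film_director_comparison\",\n",
    "        \"merge_data_extractions\",\n",
    "    ],\n",
    "}\n",
    "\n",
    "# static_order() yields every task only after all of its dependencies.\n",
    "execution_order = list(TopologicalSorter(plan_deps).static_order())\n",
    "print(execution_order)"
   ]
  },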
  {
   "cell_type": "markdown",
   "id": "a75087e9-8227-47f0-9b54-5eeeb758aba7",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide TwentyThree](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/23.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30cbac4d-170c-4a28-8211-92851f107e1d",
   "metadata": {},
   "source": [
    "![Divider Image](https://d3ddy8balm3goa.cloudfront.net/mlops-rag-workshop/divider-2.excalidraw.svg)\n",
    "![Slide TwentyTwo](https://d3ddy8balm3goa.cloudfront.net/georgian-partners-genai-bootcamp/v3/22.svg)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "georgian-genai-bootcamp",
   "language": "python",
   "name": "georgian-genai-bootcamp"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
